The Automatic Translation of Discourse Structures

Authors

  • Daniel Marcu
  • Lynn Carlson
  • Maki Watanabe
Abstract

We empirically show that there are significant differences between the discourse structure of Japanese texts and the discourse structure of their corresponding English translations. To improve translation quality, we propose a computational model for rewriting discourse structures. When we train our model on a parallel corpus of manually built Japanese and English discourse structure trees, we learn to rewrite Japanese trees as trees that are closer to the natural English rendering than the original ones.

1 Motivation

Almost all current MT systems process text one sentence at a time. Because of this limited focus, MT systems cannot regroup and reorder the clauses and sentences of an input text to achieve the most natural rendering in a target language. Yet, even between languages as close as English and French, there is a 10% mismatch in the number of sentences: what is said in two sentences in one language is said in only one, or in three, in the other (Gale and Church, 1993). For distant language pairs, such as Japanese and English, the differences are more significant. Consider, for example, Japanese sentence (1), a word-by-word "gloss" of it (2), and a two-sentence translation of it that was produced by a professional translator (3).

(2) [The Ministry of Health and Welfare last year revealed 1] [population of future estimate according to 2] [in future 1.499 persons as the lowest 3] [that after *SAB* rising to turn that 4] [*they* estimated but 5] [already the estimate misses a point 6] [prediction became. 7]

(3) [In its future population estimates 1] [made public last year, 2] [the Ministry of Health and Welfare predicted that the SAB would drop to a new low of 1.499 in the future, 3] [but would make a comeback after that, 4] [increasing once again. 5] [However, it looks as if that prediction will be quickly shattered. 6]

The labeled spans of text represent elementary discourse units (edus), i.e., minimal text spans that have an unambiguous discourse function (Mann and Thompson, 1988). If we analyze the text fragments closely, we will notice that in translating sentence (1), a professional translator chose to realize the information in Japanese unit 2 first (unit 2 in text (1) corresponds roughly to unit 1 in text (3)); then to realize some of the information in Japanese unit 1 (part of unit 1 in text (1) corresponds to unit 2 in text (3)); to fuse …
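The reordering described in this excerpt can be illustrated with a small data structure. The sketch below is not the authors' model; it only records the two unit correspondences that the text states explicitly and checks whether they imply a change of order. The names `Edu`, `correspondences`, and `is_reordered` are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edu:
    """An elementary discourse unit: its position in the text and its wording."""
    index: int
    text: str


# First two units of the gloss and of the translation, as quoted in the excerpt.
gloss = [
    Edu(1, "The Ministry of Health and Welfare last year revealed"),
    Edu(2, "population of future estimate according to"),
]
translation = [
    Edu(1, "In its future population estimates"),
    Edu(2, "made public last year,"),
]

# Only the correspondences stated in the text are recorded (assumed encoding):
# (source unit index, target unit index, whether the match is only partial).
correspondences = [
    (2, 1, False),  # Japanese unit 2 corresponds roughly to English unit 1
    (1, 2, True),   # part of Japanese unit 1 corresponds to English unit 2
]


def is_reordered(pairs):
    """Return True if any two source units appear in the opposite order in the target."""
    return any(
        (s1 < s2) != (t1 < t2)
        for i, (s1, t1, _) in enumerate(pairs)
        for (s2, t2, _) in pairs[i + 1:]
    )


print(is_reordered(correspondences))
```

For the two correspondences given here the check reports a reordering, which is the kind of clause-level regrouping that a system working one sentence at a time cannot produce.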


Similar resources

Seeking Source Discourse Ideology by English and Persian Translators: A Comparative Think Aloud Protocol Study

Discourse audiences are susceptible to falling victim to the concealed ideological representations in discourses at the expense of changing and modifying their mental models through which they act on the world. Translators as readers and at the same time intercultural mediators need to be equipped with the knowledge of how ideology is accommodated in discourse both not to fall victim to it and to...


Discovery of Discourse-Related Language Contrasts through Alignment Discrepancies in English-German Translation

In this paper, we analyse alignment discrepancies for discourse structures in English-German parallel data: sentence pairs in which discourse structures in target or source texts have no alignment in the corresponding parallel sentences. The discourse-related structures are designed in the form of linguistic patterns based on the information delivered by automatic part-of-speech and dependency an...


Strategies Used in Translation of Comedies with Emphasis on Politeness

The present study sought to investigate the translation strategies in an American sitcom in Iranian EFL classes with emphasis on politeness. The participants were 50 male and female Iranian undergraduate B.A. and M.A. students majoring in English Translation, and English language teaching at the Islamic Azad University, North Tehran. The participants were administered three tests. A multiple choic...


Transmission of Ideology through Translation: A Critical Discourse Analysis of Chomsky’s “Media Control” and its Persian Translations

Among the factors that might manipulate translators’ minds while producing a text is the notion of ideology transmission through text or talk. Adopting Critical Discourse Analysis (CDA) with particular emphasis on the framework of Van Dijk (1999), the present investigation is an attempt to shed light on the relationship between language and ideology involved in translation in general, and more speci...


Iranian Advanced EFL Learners’ Awareness and the Use of Marked Word Order: Discourse-pragmatically Motivated Variations

The present investigation was designed to study the production and comprehension of specific means for information highlighting by advanced Iranian learners of English as a Foreign Language. The study focused on the discourse-pragmatically motivated variations of the basic word order such as inversion, pre-posing, it- and Wh-clefts. After taking the Nelson test, a homogeneous group was settled. ...


Manipulation As an Ideological Tool in the Persian Translations of Ervand Abrahamian’s The Coup: A Multimodal CDA Approach

The present Critical Discourse Analysis (CDA) study aimed to explore the probable ideological manipulations exerted in three translations of an English political book entitled The Coup by Ervand Abrahamian. This comparative qualitative study was conducted based on Farahzad’s three-dimensional CDA model. The textual, paratextual, and ...



Publication date: 2000